Abstract

All failure detection methods are based, either explicitly or implicitly,
on the use of redundancy, that is, on (possibly dynamic) relations among the
measured variables. Consequently, the robustness of the failure detection
process depends to a great degree on the reliability of the redundancy relations
given the inevitable presence of model uncertainties. In this paper we address
the problem of determining redundancy relations which are optimally robust
in a sense which includes the major issues of importance in practical failure
detection and which provides us with a significant amount of intuition
concerning the geometry of robust failure detection. In addition, we provide
a procedure, involving the construction of a single matrix and the computation
of its singular value decomposition, for the determination of a complete
sequence of redundancy relations ordered in terms of their level of robustness.
This procedure also provides the basis for comparing the levels of robust
redundancy provided by different sets of sensors.
I. Introduction

In  recent  years  a  wide  variety  of  techniques  has  been  proposed  for 
the  detection,  isolation,  and  accommodation  of  failures  in  dynamic  systems 
(see,  for  example,  the  surveys  in  [1,4]  and  the  numerous  papers  in  the 
more  recent  literature) .  Some  of  these  methods  have  been  developed  starting 
from  general,  abstract  dynamic  models,  while  others  have  been  produced  in 
the  context  of  particular  applications.  While  the  general  methods  provide 
the  basis  for  what  in  principle  should  be  a  widely  applicable  failure  detection 
methodology,  their  very  generality  often  tends  to  obscure  (or  at  best  fail 
to  highlight)  the  important  concepts  that  must  be  considered  in  the  design 
of  practical  and  reliable  failure  detection  systems.  On  the  other  hand, 
while  the  methods  that  have  been  developed  for  specific  applications  may  directly 
address  these  basic  concepts,  this  is  typically  done  in  a  very  problem-specific 
manner  which  makes  it  difficult  to  separate  out  those  aspects  of  the  design 
that  can  be  generalized  and  those  that  cannot. 

As  a  result,  there  does  not  at  present  exist  a  satisfactory  general  design 
methodology  for  robust  failure  detection  algorithms.  The  general 
approaches  to  failure  detection  that  have  been  developed  take  as  their  starting 
point  mathematical  models  of  both  the  system  under  consideration  and  of  the  types 
of  failures  that  may  occur.  However,  if  one  attempts  to  use  one  of  these 
approaches  in  a  top-down  or  "canned"  manner  in  which  one  generates  the 
requisite  overall  models  and  then  essentially  plugs  them  in  to  the  approach  chosen, 
the  likely  result  will  be  a  failure  detection  algorithm  that  does  not  work 
satisfactorily.  The  typical  reason  for  this  is  the  presence  of  discrepancies 
between  the  behavior  of  the  actual  system  and  that  predicted  by  the  model  on 
which the detection and isolation algorithm is based.

The  explanation  for  this  sensitivity  to  model  uncertainty  is  relatively 
simple.  In  one  way  or  another,  all  failure  detection  methods  generate  signals 
which  tend  to  highlight  the  presence  of  particular  failures  if  they  have 
actually  occurred.  Consequently,  if  any  model  uncertainties  have  effects 
on  the  observables  which  are  at  all  like  those  of  one  or  more  of  the 
failure  modes,  these  will  also  be  accentuated.  From  this  perspective  we 
see  that  the  issue  of  robustness  for  failure  detection  is  fundamentally 
different from the issue of robustness in filtering and control. In particular,
the goals of filtering and control are typically to keep error
signals  small  and  also  to  attenuate  high  frequency  effects.  On  the  other  hand, 
the  goal  of  failure  detection  is  to  accentuate  particular  error  signals  and  in 
fact  to  amplify  the  transient,  high  frequency  portions  of  these  signals  (in 
order  to  minimize  detection  delay) .  Consequently,  one  would  expect  that  very 
different  approaches  would  be  needed  to  design  robust  failure  detection 
systems  which  must  be  maximally  sensitive  to  some  effects  (failures)  and 
minimally  sensitive  to  others  (model  errors) . 

One  approach  to  solving  this  problem  is  to  attempt  to  compensate  the 
detection  algorithm  by  estimating  uncertainties  on-line  or  by  attempting 
to  detect  such  uncertainties  and  distinguish  them  from  failures  as  part  of 
the detection algorithm [6, 7, 12]. The other alternative is to attempt
to  directly  design  a  failure  detection  system  which  is  insensitive  to  model 
errors.  The  work  described  here  focuses  on  the  latter  alternative.  The 
initial  impetus  for  our  approach  came  from  the  work  reported  in  [5,  13],  which 
document the first and to-date by far most successful application and flight
testing of a failure detection algorithm based on advanced methods using
analytic  redundancy.  The  singular  feature  of  that  project  was  that  advanced 
methods  were  not  applied  in  a  purely  top-down  manner.  Rather,  the  dynamics 
of  the  aircraft  were  decomposed  in  order  to  analyze  the  relative  reliability 
of  each  individual  source  of  potentially  useful  failure  detection  information. 
In  this  way  a  design  was  developed  that  utilized  only  the  most  reliable 
information. 

In  [2]  we  presented  the  results  of  our  initial  attempt  to  extract  the 
essence of the method used in [5, 13] in order to develop a general approach
to  robust  failure  detection.  As  discussed  in  those  references  and  in  others 
(such  as  [3,  7-9]),  all  failure  detection  systems  are  based  on  exploiting 
analytical  redundancy  relations  or  (generalized)  parity  checks.  These  are 
simply  functions  of  the  temporal  histories  of  the  measured  quantities  which 
have  the  property  that  they  are  small  (ideally  zero)  when  the  system  is 
operating  normally.  As  we  discuss  in  the  next  section,  essentially  all  of 
the  recently-developed  general  detection  methods  make  implicit,  rather  than 
explicit  use  of  all  of  these  relations,  and  for  this  reason  a  top-down 
application  of  any  of  these  methods  mixes  together  information  of  varying 
levels  of  reliability.  What  would  clearly  be  preferable  would  be  a  general 
method  for  explicitly  identifying  and  utilizing  only  the  most  reliable  of  the 
redundancy  relations.  Several  researchers  [2,  3,  7-9]  have  discussed  methods 
for  specifying  all  possible  redundancy  relations  for  a  given  model  (see  [3]), 
but  the  problem  remains  of  finding  the  most  reliable  of  these  relations 
given  the  presence  of  uncertainties.  One  criterion  for  measuring  the 
reliability  of  a  particular  redundancy  relation  was  presented  in  [2]  and  was 
used to pose an optimization problem to determine the most reliable relation.

This  criterion  has  the  feature  that  it  specifies  robustness  with  respect  to 
a  particular  operating  point  thereby  allowing  the  possibility  of  adaptively 
choosing  the  best  relations.  However  a  potential  drawback  of  this  approach 
is  that  it  leads  to  an  extremely  complex  optimization  problem.  Moreover, 
if  one  is  interested  in  obtaining  a  list  of  redundancy  relations  in  order 
from most to least reliable, one must essentially solve a separate optimization
problem for each relation in the list.

In  this  paper  we  look  at  an  alternative  measure  of  reliability  for  a 
redundancy relation. Not only does this alternative have a helpful geometric
interpretation, but it also leads to a far simpler optimization
procedure involving only one singular value decomposition. In addition, it allows
us  in  a  natural  and  computationally  feasible  way  to  consider  issues  such  as 
scaling,  relative  merits  of  alternative  sensor  sets,  and  explicit  tradeoffs 
between  detectability  and  robustness. 

In  the  next  section  we  review  the  notion  of  analytic  redundancy  for 
perfectly  known  models  and  provide  a  geometric  interpretation  which  forms  the 
starting  point  for  our  investigation  of  robust  failure  detection.  Section  III 
addresses  the  problem  of  robustness  using  our  geometric  ideas,  and  in  that 
section  we  pose  and  solve  a  first  version  of  the  optimum  robust  redundancy 
problem.  In  Section  IV  we  discuss  extensions  to  include  three  important 
issues  not  included  in  Section  III:  scaling,  noise,  and  the  detection/robustness 
tradeoff.  Our  approach  is  illustrated  with  an  example  in  Section  V,  and  we 
make some concluding remarks in Section VI.
II. Redundancy Relations

In  this  paper  we  focus  attention  on  linear,  discrete-time  systems, 
and  in  this  section  we  consider  the  noise-free  model 


\[ x(k+1) = A x(k) + B u(k) \tag{1} \]

\[ y(k) = C x(k) \tag{2} \]

where x is n-dimensional, u is m-dimensional, y is r-dimensional, and A, B,
and  C  are  perfectly  known.  A  redundancy  relation  for  this  model  is  some  linear 
combination  of  present  and  lagged  values  of  u  and  y  which  should  be  identically 
zero  if  no  changes  (i.e.  failures)  occur  in  (1),  (2).  As  discussed  in  [2], 
redundancy  relations  can  be  specified  mathematically  in  the  following  way. 

The subspace of (p+1)r-dimensional vectors given by

\[ P = \left\{ \omega \;\middle|\; \omega' \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{p} \end{bmatrix} = 0 \right\} \tag{3} \]

is called the space of parity or redundancy relations of order p. The reason
for this terminology is the following. Suppose that $\omega \in P$. Then (1)-(3)
imply that if we partition $\omega$ into (p+1) subvectors of dimension r


\[ \omega' = [\omega_0', \ldots, \omega_p'] \tag{4} \]

then at any time k

\[ r(k) = \sum_{i=0}^{p} \omega_i' \Big[ y(k-p+i) - \sum_{j=0}^{i-1} C A^{i-j-1} B\, u(k-p+j) \Big] = 0 \tag{5} \]

The quantity r(k) is called a parity check. A simpler form for (5) (which
we will use later) can be written in the case when u = 0 (or, equivalently,
if the effects of the inputs are subtracted from the observations before
computing the parity check). In this case

\[ r(k) = \omega' \begin{bmatrix} y(k-p) \\ y(k-p+1) \\ \vdots \\ y(k) \end{bmatrix} \tag{6} \]
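As a rough numerical illustration of (3) and (6), the parity space can be computed as the left null space of the stacked matrix in (3). The sketch below assumes NumPy; the function name `parity_space`, the tolerance, and the small two-state system are hypothetical choices made only for this example and are not taken from the paper.

```python
import numpy as np

def parity_space(A, C, p, tol=1e-10):
    """Return a matrix whose columns form an orthonormal basis of the
    order-p parity space, i.e. all w with w' [C; CA; ...; CA^p] = 0."""
    blocks = [C @ np.linalg.matrix_power(A, i) for i in range(p + 1)]
    O = np.vstack(blocks)                 # ((p+1)r) x n stacked matrix from (3)
    U, s, _ = np.linalg.svd(O)            # full SVD: U is square
    rank = int(np.sum(s > tol * max(s.max(), 1.0)))
    return U[:, rank:]                    # left singular vectors orthogonal to range(O)

# Hypothetical 2-state system with one output (values chosen for illustration).
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
C = np.array([[1.0, 0.0]])
p = 2
W = parity_space(A, C, p)

# Simulate the undriven system and stack the window y(k-p), ..., y(k) as in (6).
x = np.array([1.0, -2.0])
ys = []
for _ in range(10):
    ys.append(C @ x)
    x = A @ x
window = np.concatenate(ys[-(p + 1):])
r = W.T @ window                          # parity checks; zero when no failure is present
```

For this system the stacked matrix is 3 x 2 with rank 2, so a single parity relation of order 2 exists and the residual `r` vanishes up to rounding error.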


To  illustrate  the  notion  of  parity  relations,  consider  a  system  in 
which we observe both a velocity and an acceleration variable. Let $y_1(k)$
denote the output of the velocity sensor and $y_2(k)$ the output of the
acceleration sensor. If the sampling rate is high enough (so that
acceleration is essentially constant over time intervals of length T),
an obvious parity check (of order 1) is

\[ r(k) = y_1(k) - y_1(k-1) - T\, y_2(k-1) \tag{7} \]

Note  that  this  is  a  valid  parity  check  under  the  stated  assumptions  even 
if  the  velocity  and  acceleration  variables  are  embedded  in  a  far  more 
complex  system  (for  example  if  these  are  sensors  measuring  variables  of  one 
part  of  a  large  mechanical  system  or  if  drag  and  damping  effects  are  present) . 
The  importance  of  this  point  is  made  clear  in  what  follows. 
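The velocity/acceleration parity check (7) can be tried out numerically. The following is a minimal sketch assuming NumPy; the sampling interval, the constant acceleration profile, and the accelerometer bias used to mimic a failure are all assumed values introduced for illustration.

```python
import numpy as np

T = 0.01                       # sampling interval (assumed value)
N = 200
accel = np.full(N, 2.0)        # constant acceleration, so (7) holds exactly here
vel = np.zeros(N)
for k in range(1, N):
    vel[k] = vel[k - 1] + T * accel[k - 1]

def parity_check(y1, y2, T):
    """r(k) = y1(k) - y1(k-1) - T*y2(k-1) for k = 1, ..., len(y1)-1."""
    return y1[1:] - y1[:-1] - T * y2[:-1]

r_normal = parity_check(vel, accel, T)

# Hypothetical failure: the accelerometer picks up a constant bias from sample 100 on.
accel_failed = accel.copy()
accel_failed[100:] += 0.5
r_failed = parity_check(vel, accel_failed, T)
```

Under normal operation the residual stays at zero, while the biased accelerometer produces a constant offset of size T times the bias from the moment the failure occurs.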

To continue our development, let us assume that*

\[ \omega_p \neq 0 \tag{8} \]

Let us denote the components of $\omega_i$ as

\[ \omega_i' = [\omega_{i1}, \ldots, \omega_{ir}] \tag{9} \]

Since at least one element of $\omega_p$ is nonzero, we can normalize $\omega$ so this
component has unity value. In order to illustrate several points, let us
assume that the first component, $\omega_{p1} = 1$. In this case (5) can be
rewritten

\[ y_1(k) = -\sum_{i=0}^{p-1} \omega_{i1}\, y_1(k-p+i) - \sum_{i=0}^{p} \sum_{s=2}^{r} \omega_{is}\, y_s(k-p+i) + \sum_{i=0}^{p} \omega_i' \sum_{j=0}^{i-1} C A^{i-j-1} B\, u(k-p+j) \tag{10} \]

* If $\omega_p = 0$, then the parity relation is actually of order p-1, since
$[\omega_0', \ldots, \omega_{p-1}']'$ belongs to the space of parity relations of order p-1.


In  our  example  (10)  reduces  to 


\[ y_1(k) = y_1(k-1) + T\, y_2(k-1) \tag{11} \]


There  are  two  very  important  interpretations  of  (10) .  The  most  obvious 
is  that  the  right-hand  side  of  this  equation  represents  a  synthetic  measurement 
which can be directly compared to $y_1(k)$ in a simple comparison test. The
second interpretation of (10) is as a reduced-order dynamic model. Specifically,
this equation is nothing but an autoregressive-moving average (ARMA) model
for $y_1(k)$. That is, $y_1(k)$ is expressed in terms of its own lagged values and
of the present and past values of a set of exogenous variables, namely the
remaining sensor outputs $y_2, \ldots, y_r$ and the input u. (From the point
of view of the evolution of $y_1$ according to (10), $y_2, \ldots, y_r$ and the components
of u are all regarded as inputs.) Equation (11) makes this point quite clear,
as $y_1$ satisfies a first-order difference equation driven by the measurement
of  acceleration.  As  this  measurement  would  of  necessity  capture  all  sources 
of  acceleration  or  deceleration  (e.g.  thrust  and  drag) ,  damping  terms  (such 
as  drag)  do  not  appear  explicitly. 

This second interpretation, which views a parity relation as a reduced-order
dynamic model, allows us to make contact with the numerous existing
failure detection methods. Typically such methods are based on a noisy
version  of  the  model  (1) ,  (2)  representing  normal  system  behavior 
together  with  a  set  of  deviations  from  this  model  representing  the  several 
failure  modes.  Rather  than  applying  such  methods  to  a  single,  all-encompassing 
model  as  in  (1) ,  (2) ,  one  could  alternatively  apply  the  same  techniques  to 
individual  models  as  in  (10) ,  (or  a  combination  of  several  of  these) ,  thereby 
isolating  individual  (or  specific  groups  of)  parity  relations.  For  example, 
this  is  precisely  what  was  done  in  [5,  13].  The  advantage  of  such  an 
approach  is  that  it  allows  one  to  separate  the  information  provided  by 
redundancy  relations  of  differing  levels  of  reliability,  something  that  is 
not  easily  done  when  one  starts  with  the  overall  model  (1) ,  (2)  which 
combines  all  redundancy  relations. 

In  the  next  two  sections  we  address  the  main  problem  of  this  paper, 
which  is  the  determination  of  optimally  robust  redundancy  relations.  The 
key to this approach is obtained by re-examining (3). Specifically, from
this equation we see that $\omega \in P$ if and only if $\omega$ is orthogonal to the range
of the matrix

\[ \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{p} \end{bmatrix} \tag{12} \]

This suggests a geometrical interpretation of parity relations. In particular,
consider the model (1), (2) with u = 0, and let Z denote the range of the
matrix in (12). Then a complete set of parity relations of order p is given
by the orthogonal projection of the window of observations y(k), y(k-1), ..., y(k-p)
onto the orthogonal complement, G, of Z. To illustrate this, consider an
example in which the first two components of y measure scaled versions of
the same variable, i.e.

\[ y_2 = \alpha y_1 \tag{13} \]

Then, as illustrated in Figure 1, in $y_1$-$y_2$ space the subspace Z is simply
the line specified by Eq. (13). Furthermore, in this case the obvious
parity relation is

\[ r = y_2 - \alpha y_1 \tag{14} \]

which is nothing more than the orthogonal projection (up to a scale factor) of
the observed pair of values $y_1$ and $y_2$ onto the line G perpendicular to Z
(Figure 1). For interpretations of the parity space in purely matrix terms and
in terms of polynomial matrices we refer the reader to [9] and [3], respectively.
It is the geometric interpretation, however, which we will utilize here.
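The geometric picture for the scaled-sensor example (13), (14) can be checked with a few lines of NumPy. This is only an illustrative sketch: the scale factor and the two measurement pairs are assumed values, and the residual computed here is the unit-normalized projection, which equals (14) divided by the square root of one plus the squared scale factor.

```python
import numpy as np

alpha = 2.0                                          # assumed scale factor in (13)
z = np.array([1.0, alpha]) / np.hypot(1.0, alpha)    # unit vector spanning Z
g = np.array([-alpha, 1.0]) / np.hypot(1.0, alpha)   # unit vector spanning G

y_ok = np.array([3.0, 6.0])                  # consistent pair: y2 = alpha * y1
y_bad = np.array([3.0, 7.0])                 # second sensor off by 1.0

r_ok = g @ y_ok                              # projection coordinate along G
r_bad = g @ y_bad
```

A measurement pair lying on the line Z gives a zero residual, while a pair that violates (13) has a nonzero component along G.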
Figure 1: An Example of the Geometric Interpretation of
Parity Relations.


III. An Angular Measure of Robustness

Consider a model containing imperfectly known parameters $\eta$, process
noise w and measurement noise v:

\[ x(k+1) = A(\eta)x(k) + B(\eta)u(k) + w(k) \tag{15} \]

\[ y(k) = C(\eta)x(k) + v(k) \tag{16} \]

where $\eta$ is a vector of unknown parameters and where the matrices A, B, C
and the covariances of w and v are functions of $\eta$. Let K denote the set
of possible values which $\eta$ can take on. In their work [2] Chow and Willsky
used the following line of reasoning. If the parameters of the system
were known perfectly and if there were no process or measurement noises,
then according to (5) we could find a vector $\omega' = [\omega_0', \ldots, \omega_p']$ and a vector
$\mu' = [\mu_0', \ldots, \mu_{p-1}']$ with

\[ \mu_i' = \sum_{j=i+1}^{p} \omega_j' C A^{j-i-1} B \tag{17} \]

so that

\[ r(k) = \sum_{i=0}^{p} \omega_i'\, y(k-p+i) - \sum_{i=0}^{p-1} \mu_i'\, u(k-p+i) = 0 \tag{18} \]
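The passage from (5) to (17), (18) is an exchange of the order of summation over the input terms. A short derivation in the notation of (5), offered as a reading aid rather than as part of the original argument, is:

```latex
\begin{aligned}
\sum_{i=0}^{p} \omega_i' \sum_{j=0}^{i-1} C A^{i-j-1} B\, u(k-p+j)
  &= \sum_{j=0}^{p-1} \Big[ \sum_{i=j+1}^{p} \omega_i' C A^{i-j-1} B \Big] u(k-p+j) \\
  &= \sum_{j=0}^{p-1} \mu_j'\, u(k-p+j),
\end{aligned}
```

so the bracketed coefficient of each lagged input is exactly (17) with the roles of the indices i and j interchanged.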

In the uncertain case, what would seem to make sense is to minimize some
measure of the size of r(k). For example, one could consider choosing the $\omega$
that solves the minimax problem

\[ \min_{\|\omega\| = 1} \; \max_{\eta \in K} \; E_{x_0(\eta)} \big\{ [r(k)]^2 \big\} \tag{19} \]

where the constraint $\|\omega\| = 1$ is made to avoid the trivial solution
$\omega = \mu = 0$. Here the expectation is taken for each value of $\eta$ and assuming
that the system is at a particular operating point, i.e. that $u(k) = u_0$
and that $x_0(\eta)$ is the corresponding set point value of the state, i.e.

\[ x_0(\eta) = A(\eta)\, x_0(\eta) + B(\eta)\, u_0 \tag{20} \]

This criterion has the interpretation of finding the approximate parity
relation which, at the specified operating point, produces the residual with
the smallest worst-case mean-square value when no failure has occurred.
Alternatively, one could consider a less conservative criterion by replacing
the worst-case maximization over $\eta$ by a weighted integral over $\eta$, where the
weighting function can alternatively be thought of as a probability
distribution over $\eta$.

(Note that with the formulation of model uncertainty in (15), (16) one can
incorporate the possibility of neglected dynamics by state augmentation, if one
has an upper bound on the order of these dynamics.)

Let  us  make  several  comments  concerning  the  procedure  just  described. 

In  the  first  place  the  optimization  problem  (19)  is  a  complex  nonlinear 
programming  problem.  Furthermore,  the  method  does  not  easily  give  a 
sequence  of  parity  relations  ordered  by  their  robustness.  One  can,  of  course, 
obtain such a sequence, but at substantial computational cost. In particular,
if $\omega^*$ is the solution to one of these optimization problems one can then
solve for the next best parity relation by re-solving the optimization problem
with the additional constraint that the solution must be orthogonal to the
previously determined relation, i.e. $\omega'\omega^* = 0$. Clearly this process can be
iterated but at each stage we have an optimization problem of essentially
the same level of difficulty as the original one. Finally, the optimum parity
relation clearly depends upon the operating point as specified by $u_0$ and
$x_0(\eta)$. In some problems this may be desirable as it does allow one to
adapt the failure detection algorithm to changing conditions (although
it  requires  solving  the  optimization  problem  for  every  likely  operating 
regime) ,  but  in  others  it  might  be  acceptable  or  preferable  to  have  a 
single  set  of  parity  relations  for  all  operating  conditions.  The  approach 
developed  in  this  paper  produces  such  a  set  and  results  in  a  far  simpler 
computational  procedure. 

To  begin,  let  us  focus  on  the  noise-free,  undriven  model 


\[ x(k+1) = A(\eta)x(k) \tag{21} \]

\[ y(k) = C(\eta)x(k) \tag{22} \]

Referring to the previous discussion, we note that it is in general impossible
to find parity checks which are perfect for all possible values of $\eta$. That is,
in general we cannot find a subspace G which is orthogonal to

\[ Z(\eta) = \mathrm{Range} \begin{bmatrix} C(\eta) \\ C(\eta)A(\eta) \\ \vdots \\ C(\eta)A(\eta)^{p} \end{bmatrix} \tag{23} \]

for all $\eta$. What would seem to make sense in this case is to choose a subspace
G which is "as orthogonal as possible" to all possible $Z(\eta)$. Returning to
our simple example, suppose that $y_2 = \alpha y_1$ but $\alpha$ is not known precisely. Rather,
what we do know is that

\[ \alpha_{\min} \le \alpha \le \alpha_{\max} \tag{24} \]
In this case we obtain the picture shown in Figure 2. Here the shaded
region represents the range of $(y_1, y_2)$ values consistent with $y_2 = \alpha y_1$
and with (24). Intuitively, what would seem to be a good choice for G
(assuming that $\alpha$ is equally likely to lie anywhere in the interval (24))
is the line which bisects the obtuse angle made by the shaded sector in
Figure 2. It is precisely this geometric picture which is generalized
and built upon in this paper.

In  particular,  one  natural  generalization  of  the  concept  depicted 
in  Figure  2  is  obtained  by  noting  that  G  in  this  figure  is  the  line  which 
maximizes  the  minimum  angle  between  itself  and  any  line  in  the  shaded 
sector.  In  general  one  can  extend  this  idea  by  defining  the  cosine  of  the 
"angle"  between  two  subspaces  H  and  M  as  the  maximum  inner  product  of  a  unit 
vector in H with a unit vector in M. An equivalent definition which we will
find useful is that the cosine of the angle between H and M equals the maximum
magnitude of the projection of any unit vector in H onto M, i.e.

\[ \cos(\angle(H,M)) = \sup_{y \in H,\ \|y\| = 1} \| P_M\, y \| \tag{25} \]

where $P_M$ is the orthogonal projection onto M. As shown in Appendix A.1
(see also [16]), if we use the same symbols H and M to denote matrices* whose
columns form orthonormal bases for the corresponding subspaces, then

\[ \cos(\angle(H,M)) = \sigma_{\max}(H'M) \tag{26} \]

where $\sigma_{\max}(H'M)$ is the maximum singular value of H'M (see Appendix A.0).

* From this point we will use the same symbol to denote a subspace and a matrix
whose columns form an orthonormal basis for the subspace.

Note that this definition of the angle between subspaces
has the property that, if $H \cap M$ is larger than $\{0\}$, the angle
between H and M is zero. Thus the angle between any two distinct two-dimensional
subspaces in 3-space is zero, as the intersection of these
spaces is a line. While this may at first glance appear troublesome, it
makes good sense for the problem at hand. Recall that computing parity
checks corresponds to projecting onto a chosen subspace (say, M) the most
recent history of output values which under normal conditions take values
in a second subspace (H). If $H \cap M \neq \{0\}$, then for some output histories
within normal limits one will find that one or more of the computed parity
checks will be large.
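The angle defined by (25), (26) is straightforward to evaluate numerically. The sketch below assumes NumPy and full-column-rank inputs; the helper name `cos_subspace_angle` and the two test cases are hypothetical, and a QR factorization is used here simply as one convenient way to obtain orthonormal bases.

```python
import numpy as np

def cos_subspace_angle(H, M):
    """Cosine of the angle in (26): largest singular value of H'M, where H and M
    are first replaced by orthonormal bases of their column spaces."""
    Qh, _ = np.linalg.qr(H)
    Qm, _ = np.linalg.qr(M)
    return np.linalg.svd(Qh.T @ Qm, compute_uv=False)[0]

# Two distinct planes in 3-space: the x-y plane and the x-z plane share the x-axis.
xy = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
xz = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
c_planes = cos_subspace_angle(xy, xz)      # cosine 1, i.e. angle zero

# Two lines in the plane separated by 60 degrees.
l1 = np.array([[1.0], [0.0]])
l2 = np.array([[np.cos(np.pi / 3)], [np.sin(np.pi / 3)]])
c_lines = cos_subspace_angle(l1, l2)       # cosine of 60 degrees
```

The first case reproduces the remark in the text: two distinct planes in 3-space intersect in a line, so the cosine is one and the angle is zero.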

Returning  now  to  the  problem  of  determining  robust  parity  checks  we 
see  that  choosing  a  subspace  G  to  maximize  the  minimum  angle  (or  equivalently 
to minimize the maximum cosine of the angle) between it and $Z(\eta)$ as $\eta$ ranges
over K is equivalent to

\[ \min_{G'G = I} \; \max_{\eta \in K} \; \sigma_{\max}^2 \big( G' Z(\eta) \big) \tag{27} \]

Here the condition G'G = I simply ensures that the columns of G form an
orthonormal basis for G. Furthermore, once we have obtained the solution
to (27), the optimum set of parity relations is obtained as*

\[ G' \begin{bmatrix} y(k-p) \\ \vdots \\ y(k-1) \\ y(k) \end{bmatrix} \tag{28} \]

Thus the rows of G' correspond to individual parity relations as in (4).

* We note that the actual projection onto G is given by GG' (see Appendix
A.1); however, if dim G = s, then all the multiplication by G does is to
coordinatize the set of parity checks (the s components of (28)) in the
higher-dimensional space in which G sits as a subspace.

While the problem just stated has a simple and conceptually useful
geometric interpretation, it suffers from several drawbacks and limitations
which we address in the remainder of this and the next section. The first
is that, although the criterion explicitly involves singular values, whose
calculation is relatively easy, the minimax problem (27) represents an
extremely complex nonlinear programming problem (on the same order of
difficulty as that investigated in [2]). One can improve things somewhat
by considering the less conservative criterion obtained by replacing the
worst-case maximization over $\eta$ in (27) with an expectation over $\eta$ [3];
however, the resulting formulation is still a complex nonlinear programming
problem. On the other hand, if we consider a variation on this idea we
obtain a far simpler problem which also has other important advantages. To
do this, however, we must make the assumption that K, the set of possible
values of $\eta$, is finite. Typically what this would involve is choosing
representative points out of the actual, continuous range of parameter
values. Here "representative" means spanning the range of possible values
and having density variations reflecting any desired weightings on the
likelihood or importance of particular sets of parameter values. For
the example in Figure 2 this would correspond to choosing a finite set
of values of $\alpha$ between $\alpha_{\min}$ and $\alpha_{\max}$. If all values of $\alpha$ are equally likely,
the samples chosen would be uniformly spaced; however, if $\alpha_{\min}$ represented the
more likely extreme or the one which we view as the most critical, we would
choose a higher density of points near this value. However this is accomplished,
we will assume for the remainder of this paper that $\eta$ takes on a discrete
set of values $\eta = 1, \ldots, L$, and will use the notation $A_i$ for $A(\eta = i)$, $Z_i$
for $Z(\eta = i)$, etc.

To obtain a simpler computational procedure for determining robust
redundancy relations, we proceed as follows. Rather than computing the angle
of G with each $Z_i$ and choosing G to maximize this on the average, we reverse
these two steps: we first compute an average observation subspace $Z_0$ which
is as close as possible to all of the $Z_i$ and we then choose G to be the
orthogonal complement of $Z_0$. This idea is also illustrated in Figure 2,
where the average observation space $Z_0$ is depicted as the line which bisects
the shaded region, and the line G then represents its orthogonal complement.

In the general case let us first note that the $Z_i$ are subspaces of
possibly differing dimensions embedded in a space of dimension N = (p+1)r
corresponding to histories of the last p+1 values of the r-dimensional
output. Consequently, if we would like to determine the s best parity
checks (so that dim G = s), we would equivalently like to find a subspace
$Z_0$ of dimension N-s. We define a criterion for the best choice of $Z_0$ in
the following manner. Let $Z_1, \ldots, Z_L$ denote matrices of sizes $N \times \nu_i$,
i = 1, ..., L (where $\nu_i = \dim Z_i$) whose columns form orthonormal bases for
the corresponding subspaces, and let $M = \nu_1 + \cdots + \nu_L$. Define the $N \times M$ matrix

\[ Z = [Z_1 \,:\, \cdots \,:\, Z_L] \tag{29} \]

Thus the columns of Z represent directions in which observation
histories may lie under normal conditions. The optimum choice for $Z_0$ is
then taken to be the span of the (not necessarily orthonormal) columns of
the matrix $Z_0$ which minimizes

\[ \| Z - Z_0 \|_F^2 \tag{30} \]

subject to the constraint that rank $Z_0$ = N-s. Here $\|\cdot\|_F$ denotes the
Frobenius norm, which is defined by

\[ \|C\|_F^2 = \sum_{i} \sum_{j} |c_{ij}|^2 \tag{31} \]

Thus the matrix $Z_0$ is chosen so that the sum of the squared distances
between the columns of Z and of $Z_0$ is minimized subject to the constraint
that $Z_0$ have only N-s linearly independent columns.

There are several important reasons for choosing this criterion, one
being that it does produce a space which is as close as possible to a specified
set of directions (in fact, the importance of this will be made even more
clear in Section 4.1). A second is that the resulting optimization problem
is easy to solve. In particular, let the singular value decomposition (see
Appendix A.0) of Z be given by

\[ Z = U \Sigma V \tag{32} \]

where U and V are orthogonal matrices and

\[ \Sigma = \big[\, \mathrm{diag}(\sigma_1, \ldots, \sigma_N) \;\; 0 \,\big] \tag{33} \]

Here $\sigma_1 \le \sigma_2 \le \cdots \le \sigma_N$ are the singular values of Z ordered by magnitude. Note we
have assumed N < M. If this is not the case we can make it so without changing
the optimum choice of $Z_0$ by padding Z with additional columns of zeros. As
shown in [17] (see also [18]), the matrix $Z_0$ minimizing (30) is given by

\[ Z_0 = U \big[\, \mathrm{diag}(0, \ldots, 0, \sigma_{s+1}, \ldots, \sigma_N) \;\; 0 \,\big] V \tag{34} \]

Moreover, since the columns of U are orthonormal, we immediately see that
the orthogonal complement of the range of $Z_0$ is given by the first s left
singular vectors of Z, i.e. the first s columns of U. Consequently

\[ G = [u_1 \,:\, \cdots \,:\, u_s] \tag{35} \]

and $u_1, \ldots, u_s$ are the optimum redundancy relations.
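The procedure of (29)-(35) amounts to one singular value decomposition. The following is a minimal sketch assuming NumPy; the function name `robust_parity_relations` and the sampled scale factors are hypothetical. Note that `numpy.linalg.svd` returns singular values in descending order, so they are re-sorted here to match the ascending convention used in the text.

```python
import numpy as np

def robust_parity_relations(Z_list, s):
    """Given orthonormal-basis matrices Z_1..Z_L (each N x nu_i), return
    (G, sigma) where G holds the s left singular vectors of Z = [Z_1 ... Z_L]
    with the smallest singular values, and sigma lists all N singular values
    in ascending order (zero-padded if Z has fewer than N columns)."""
    Z = np.hstack(Z_list)
    N = Z.shape[0]
    U, sv, _ = np.linalg.svd(Z, full_matrices=True)
    sigma = np.zeros(N)
    sigma[:len(sv)] = sv                 # pad, mirroring the zero-column padding in the text
    order = np.argsort(sigma)            # ascending order, as in the text
    return U[:, order[:s]], sigma[order]

# Example of Figure 2 with sampled scale factors (sample values are assumptions).
alphas = np.linspace(0.5, 2.0, 7)
Z_list = [np.array([[1.0], [a]]) / np.hypot(1.0, a) for a in alphas]
G, sigma = robust_parity_relations(Z_list, s=1)
```

Here each sampled value of the scale factor contributes one unit vector to Z, and the single column of `G` is the direction that is, in the least-squares sense of (30), most nearly orthogonal to all of them.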

There  is  an  alternative  interpretation  of  this  choice  of  G  which  provides 
some  very  useful  insight.  Specifically,  recall  that  what  we  wish  to  do  is  to 
find a $G$ whose columns are as orthogonal as possible to the columns of the $Z_i$; that is, we would like to choose $G$ to make each of the matrices $Z_i' G$ as close to zero as possible. In fact, as shown in Appendix A.2, the choice of $G$ given in (35) minimizes

$$J(s) = \sum_{i=1}^{L} \|Z_i' G\|_F^2 \tag{36}$$

yielding the minimum value

$$J^*(s) = \sum_{i=1}^{s} \sigma_i^2 \tag{37}$$

As noted in the Appendix, the same choice for $G$ also minimizes other related criteria, which yields additional insight.
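The construction in (32)-(37) can be sketched numerically. The following minimal example assumes NumPy and uses randomly generated orthonormal bases as stand-ins for the $Z_i$; the data and the helper names `optimal_parity_relations` and `criterion` are illustrative and do not come from the text. It selects the $s$ left singular vectors with the smallest singular values and evaluates (36).

```python
import numpy as np

def optimal_parity_relations(bases, s):
    """Return (G, sigma): G holds the s left singular vectors of
    Z = [Z_1 : ... : Z_L] with the smallest singular values, as in (35);
    sigma holds all N singular values in ascending order."""
    Z = np.hstack(bases)                          # N x M
    U, sig, _ = np.linalg.svd(Z, full_matrices=True)
    N = Z.shape[0]
    sig_full = np.zeros(N)
    sig_full[:sig.size] = sig                     # pad with zeros if M < N
    order = np.argsort(sig_full)                  # ascending, as in the text
    return U[:, order[:s]], sig_full[order]

def criterion(bases, G):
    """Criterion (36): sum of squared Frobenius norms of Z_i' G."""
    return sum(np.linalg.norm(Zi.T @ G, "fro") ** 2 for Zi in bases)

rng = np.random.default_rng(0)
N, L, s = 6, 3, 2
# Hypothetical data: each Z_i is an orthonormal basis of a random 3-dim subspace
bases = [np.linalg.qr(rng.standard_normal((N, 3)))[0] for _ in range(L)]
G, sigma = optimal_parity_relations(bases, s)
J_opt = criterion(bases, G)
```

Under these assumptions `J_opt` should agree with the sum of the `s` smallest squared singular values, as (37) states. NumPy returns singular values in descending order, which is why the sketch sorts them explicitly.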
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There  are  three  important  points  to  observe  about  the  result  (36),  (37). 
The  first  is  that  we  can  now  see  a  straightforward  way  in  which  to  include 
unequal weightings on each of the terms in (36). Specifically, if the $w_i$ are positive numbers, then

$$\sum_{i=1}^{L} w_i \|Z_i' G\|_F^2 = \sum_{i=1}^{L} \|\sqrt{w_i}\, Z_i' G\|_F^2 \tag{38}$$

so that minimizing this quantity is accomplished using the same procedure described previously but with $Z_i$ replaced by $\sqrt{w_i}\, Z_i$. As a second point note that the optimum value (37) provides us with an interpretation of the singular values as measures of robustness and with an ordered sequence of parity relations from most to least robust: $u_1$ is the most reliable parity relation with $\sigma_1^2$ as its measure of robustness, $u_2$ is the next best relation with $\sigma_2^2$ as its robustness measure, etc. Consequently from a single singular value decomposition we can obtain a complete solution to the robust redundancy relation problem for a fixed value of $p$, i.e. for a fixed length time history of output values. To compare relations for different values of $p$ it is necessary to compute a singular value decomposition for each; this is illustrated for an example in Section V. The third point to be noted is that the above solution does not depend on which particular orthonormal basis $Z_i$ is chosen for the i-th subspace above.
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IV.  Several  Important  Extensions 

In  this  section  we  address  several  of  the  drawbacks  and  limitations  of 
the  result  of  the  preceding  section  and  obtain  modifications  to  this  result 
which overcome them at no fundamental increase in complexity.

4.1  Scaling 

A  critical  problem  with  the  criteria  of  the  preceding  section  is  that 
all  vectors  in  the  observation  spaces  are  treated  as  being  equally  likely 
to  occur.  If  there  are  differences  in  scale  among  the  system  variables  this 
may  lead  to  poor  solutions  for  the  optimum  parity  relations.  To  see  this 
consider a simple example in which two measurements $y_1$ and $y_2$ are related to two state variables $x_1$ and $x_2$ by

$$y_1 = x_1 \tag{39a}$$

$$y_2 = x_1 + \eta x_2 \tag{39b}$$

Suppose that $x_1$ has a magnitude of order 1, that $\eta$ is of order 1, and $x_2$ is of order $10^{-6}$. It is clear that $y_1 - y_2$ is a reasonable parity check. However, the previous criteria would indicate otherwise, since they implicitly consider all possible values of $x_1$ and $x_2$ to be equally likely.
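The difficulty can be seen in a small numerical sketch. It assumes NumPy and three hypothetical values of the uncertain parameter; the function name `J` and the test directions are illustrative only. For each value the observation subspace is all of the plane, so the unscaled criterion (36) assigns the same cost to the parity check $y_1 - y_2$ as to any other direction.

```python
import numpy as np

etas = [0.5, 1.0, 2.0]   # hypothetical values of the uncertain parameter

def J(bases, g):
    """Criterion (36) for a single unit-norm parity vector g."""
    return sum(np.linalg.norm(Zi.T @ g) ** 2 for Zi in bases)

# Orthonormal basis for Range([[1, 0], [1, eta]]) for each model
bases = [np.linalg.qr(np.array([[1.0, 0.0], [1.0, eta]]))[0] for eta in etas]

g_check = np.array([1.0, -1.0]) / np.sqrt(2)   # the parity check y1 - y2
g_other = np.array([1.0, 1.0]) / np.sqrt(2)    # an arbitrary other direction
J_check, J_other = J(bases, g_check), J(bases, g_other)
```

Both costs equal the number of models, so the unscaled criterion cannot single out $y_1 - y_2$ even though it is nearly zero for realistic magnitudes of $x_2$.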

To  overcome  this  drawback,  we  proceed  as  follows.  Suppose  that  we 
are  given  a  scaling  matrix  P  so  that  with  the  change  of  basis 

$$\xi = Px \tag{40}$$

one obtains a variable $\xi$ which is equally likely to lie in any direction. For example if covariance analysis has been performed on $x$ and its covariance is $Q$, then $P$ can be chosen to satisfy

$$Q = P^{-1}(P')^{-1} \tag{41}$$

and the resulting covariance of $\xi$ is the identity. Similarly, if one assumes an unknown-but-bounded model for $x$ [10]

$$x' S x \le 1 \tag{42}$$

then the appropriate choice of $P$ is such that

$$S = P'P \tag{43}$$
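One way to obtain such a $P$ is through a Cholesky factorization. The sketch below assumes NumPy and symmetric positive definite $Q$ and $S$; the numerical matrices and the function names are hypothetical. The choice is not unique, since any orthogonal matrix multiplied on the left of $P$ satisfies (41) and (43) equally well.

```python
import numpy as np

def scaling_from_covariance(Q):
    """P with Q = P^{-1} (P')^{-1}, using the Cholesky factor Q = L L'."""
    L = np.linalg.cholesky(Q)       # lower triangular
    return np.linalg.inv(L)         # P = L^{-1}, so that P Q P' = I

def scaling_from_bound(S):
    """P with S = P'P, for the bound x' S x <= 1."""
    return np.linalg.cholesky(S).T  # S = L L' = (L')' (L')

Q = np.array([[4.0, 1.0], [1.0, 2.0]])   # hypothetical covariance of x
P_cov = scaling_from_covariance(Q)
S = np.array([[2.0, 0.5], [0.5, 1.0]])   # hypothetical bound matrix
P_bnd = scaling_from_bound(S)
```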

As a next step, recall that what we would ideally like to do is to choose a matrix $G$ (whose columns represent the desired parity relations) so that

$$G' \begin{bmatrix} C_i \\ C_i A_i \\ \vdots \\ C_i A_i^p \end{bmatrix} x = G' \begin{bmatrix} C_i P^{-1} \\ C_i A_i P^{-1} \\ \vdots \\ C_i A_i^p P^{-1} \end{bmatrix} \xi \triangleq G' \bar{C}_i\, \xi \tag{44}$$

is as small as possible. In the preceding section we considered all directions in $Z_i$, the range of the matrix multiplying $x$ in (44), to be on equal footing and arrived at the criterion (36). Since all directions for $\xi$ are on equal footing, we are led naturally to the following criterion which takes scaling into account

$$J(s) = \sum_{i=1}^{L} \|\bar{C}_i' G\|_F^2 \tag{45}$$

As in (38), we can multiply the $\bar{C}_i$ by positive scalars to take into account unequal weightings on the terms in (45).

Using the result [17] cited in the previous section we see that to
find the $N \times s$ matrix $G$ (with orthonormal columns) which minimizes $J(s)$ we must perform a singular value decomposition of the matrix

$$\bar{C} = [\bar{C}_1 : \bar{C}_2 : \cdots : \bar{C}_L] = U \Sigma V \tag{46}$$

where $\sigma_1^2 \le \sigma_2^2 \le \cdots \le \sigma_N^2$ and $U = [u_1 : u_2 : \cdots : u_N]$. Then $u_1$ is the best parity relation with $\sigma_1^2$ as its measure of robustness, $u_2$ is the next best, etc., and

$$J^*(s) = \sum_{i=1}^{s} \sigma_i^2 \tag{47}$$

Note in this case that the columns of $\bar{C}$ represent the (not necessarily orthogonal) directions in which the observations are most likely to lie. Finally, in anticipation of the next subsection, suppose that we use the stochastic interpretation of $\xi$, i.e. that

$$E[\xi \xi'] = I \tag{48}$$

In this case if we define the parity check vector

$$\mu_i = G' \bar{C}_i\, \xi \tag{49}$$

then

$$E[\|\mu_i\|^2] = E[\xi' \bar{C}_i' G G' \bar{C}_i\, \xi] = E[\mathrm{tr}(\bar{C}_i' G G' \bar{C}_i\, \xi \xi')] = \mathrm{tr}(\bar{C}_i' G G' \bar{C}_i) = \|\bar{C}_i' G\|_F^2 \tag{50}$$
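The scaled procedure can be sketched as follows, assuming NumPy. The helper names `scaled_stack` and `parity_relations`, the three parameter values, and the magnitudes placed in $P^{-1}$ are illustrative assumptions. The sketch revisits the static example (39), for which the window length is $p = 0$ and the dynamics matrix plays no role.

```python
import numpy as np

def scaled_stack(A, C, p, P_inv):
    """Rows C, CA, ..., CA^p, each post-multiplied by P^{-1}, as in (44)."""
    blocks, Ak = [], np.eye(A.shape[0])
    for _ in range(p + 1):
        blocks.append(C @ Ak @ P_inv)
        Ak = A @ Ak
    return np.vstack(blocks)

def parity_relations(C_bars):
    """Left singular vectors of [C_bar_1 : ... : C_bar_L] ordered by
    ascending singular value, with the squared singular values, as in (46)."""
    U, sig, _ = np.linalg.svd(np.hstack(C_bars), full_matrices=True)
    sig_full = np.zeros(U.shape[0])
    sig_full[:sig.size] = sig
    order = np.argsort(sig_full)
    return U[:, order], sig_full[order] ** 2

# Example (39) with assumed magnitudes x1 ~ 1, x2 ~ 1e-6, so x = P^{-1} xi
P_inv = np.diag([1.0, 1e-6])
A = np.eye(2)                      # static relation: p = 0, A is unused
C_bars = [scaled_stack(A, np.array([[1.0, 0.0], [1.0, eta]]), 0, P_inv)
          for eta in (0.5, 1.0, 2.0)]
U, sig2 = parity_relations(C_bars)
u1 = U[:, 0]                       # best parity relation
J1 = sum(np.linalg.norm(Cb.T @ u1) ** 2 for Cb in C_bars)
```

With this scaling the best relation `u1` is, up to sign, proportional to $(1, -1)$, i.e. the parity check $y_1 - y_2$, and its cost `J1` matches the smallest squared singular value as in (47).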

4.2  Observation  and  Process  Noise 

In  addition  to  choosing  parity  relations  which  are  maximally  insensitive 
to  model  uncertainties  it  is  also  important  to  choose  relations  which  suppress 


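noise. Consider then the model

$$x(k+1) = A_i x(k) + D_i w(k) \tag{51}$$

$$y(k) = C_i x(k) + v(k) \tag{52}$$

where $w$ and $v$ are independent, zero-mean white noise processes with covariances $Q$ and $R$, respectively. In this case the time window of observation is given by

$$Y(k) = \begin{bmatrix} y(k) \\ y(k+1) \\ \vdots \\ y(k+p) \end{bmatrix} = \begin{bmatrix} C_i \\ C_i A_i \\ \vdots \\ C_i A_i^p \end{bmatrix} x(k) + \bar{D}_i \begin{bmatrix} w(k) \\ \vdots \\ w(k+p-1) \end{bmatrix} + \begin{bmatrix} v(k) \\ \vdots \\ v(k+p) \end{bmatrix} \tag{53}$$

in which the stacked noise vectors are denoted $W(k)$ and $V(k)$, respectively, and where

$$\bar{D}_i = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ C_i D_i & 0 & \cdots & 0 \\ C_i A_i D_i & C_i D_i & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ C_i A_i^{p-1} D_i & C_i A_i^{p-2} D_i & \cdots & C_i D_i \end{bmatrix} \tag{54}$$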


Assuming $x$ and $y$ have attained stationarity and writing $x(k) = P^{-1} \xi(k)$ for an appropriate $P$, we obtain

$$Y(k) = \bar{C}_i\, \xi(k) + \bar{D}_i W(k) + V(k), \qquad \mathrm{Var}\, \xi = I \tag{55}$$

Using the interpretation provided in (50), we obtain the following natural generalization of the criterion (45):

$$J(s) = \sum_{i=1}^{L} E[\|\mu_i\|^2] \tag{56}$$

where

$$\mu_i = G' Y(k) \tag{57}$$

when $Y(k)$ is generated by the ith model.

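Using the independence of $\xi(k)$, $W(k)$, and $V(k)$ and the fact that $\xi$ has the identity as its covariance we find that

$$J(s) = \sum_{i=1}^{L} \left[ \|\bar{C}_i' G\|_F^2 + \mathrm{tr}(\bar{D}_i' G G' \bar{D}_i \bar{Q}) + \mathrm{tr}(G G' \bar{R}) \right] \tag{58}$$

where

$$\bar{Q} = \begin{bmatrix} Q & & 0 \\ & \ddots & \\ 0 & & Q \end{bmatrix}, \qquad \bar{R} = \begin{bmatrix} R & & 0 \\ & \ddots & \\ 0 & & R \end{bmatrix} \tag{59}$$

We now write

$$\sum_{i=1}^{L} \left[ \mathrm{tr}(\bar{D}_i' G G' \bar{D}_i \bar{Q}) + \mathrm{tr}(G G' \bar{R}) \right] = \mathrm{tr}(G G' N) \tag{60}$$

where

$$N = \sum_{i=1}^{L} \bar{D}_i \bar{Q} \bar{D}_i' + L \bar{R} \tag{61}$$

Let $S$ be a matrix such that

$$N = S S' \tag{62}$$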


Then we can write

$$J(s) = \sum_{i=1}^{L} \|\bar{C}_i' G\|_F^2 + \|S' G\|_F^2 \tag{63}$$

Consequently, the effect of the noise is to specify a single additional set of directions, namely the columns of $S$, to which we would like to make the columns of $G$ as close to orthogonal as possible.
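The noise directions can be computed along the following lines. This is a minimal sketch assuming NumPy, a positive definite $N$ (so that a Cholesky factor can serve as $S$), and two hypothetical second-order models; the function names `noise_input_matrix` and `noise_directions` are illustrative.

```python
import numpy as np

def noise_input_matrix(A, C, D, p):
    """Block lower-triangular matrix of (54): block (j, m) is
    C A^(j-1-m) D for m < j and zero otherwise."""
    r, q = C.shape[0], D.shape[1]
    Dbar = np.zeros(((p + 1) * r, p * q))
    for j in range(1, p + 1):
        for m in range(j):
            blk = C @ np.linalg.matrix_power(A, j - 1 - m) @ D
            Dbar[j * r:(j + 1) * r, m * q:(m + 1) * q] = blk
    return Dbar

def noise_directions(models, Q, R, p):
    """Return (S, N) with S S' = N, where N is formed as in (61)."""
    Qbar = np.kron(np.eye(p), Q)          # block diagonal, p copies of Q
    Rbar = np.kron(np.eye(p + 1), R)      # block diagonal, p + 1 copies of R
    N = len(models) * Rbar
    for A, C, D in models:
        Dbar = noise_input_matrix(A, C, D, p)
        N = N + Dbar @ Qbar @ Dbar.T
    return np.linalg.cholesky(N), N       # requires N positive definite

# Hypothetical two-state, one-output models differing in one parameter
models = [(np.array([[0.9, a], [0.0, 0.8]]), np.array([[1.0, 0.0]]),
           np.array([[0.0], [1.0]])) for a in (0.1, 0.3)]
Q, R, p = np.array([[0.04]]), np.array([[0.01]]), 2
S, N = noise_directions(models, Q, R, p)
```

The columns of `S` can then be appended to the scaled observation matrices before taking the singular value decomposition.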

From this it is evident that the optimum choice of $G$ is computed by performing a singular value decomposition on the matrix

$$[\bar{C}_1 : \cdots : \bar{C}_L : S] = U \Sigma V \tag{64}$$

with $\sigma_1 \le \sigma_2 \le \cdots \le \sigma_N$. As before, (64) provides a complete set of parity relations ordered in terms of their degrees of insensitivity to model errors and noise.

4.3  Detection  Versus  Robustness 

The  methods  described  to  this  point  involve  measuring  the  quality  of 
redundancy relations in terms of how small the resulting parity checks are under normal operating conditions. That is, good parity checks are maximally insensitive to modeling errors and noise. However, in some cases one might prefer to use an alternative viewpoint. In particular there may be parity checks which are not optimally robust in the senses we have discussed but are still of significant value because they are extremely sensitive to
particular  failure  modes.  In  this  subsection  we  consider  a  criterion  which 
takes  such  a  possibility  into  account.  For  simplicity  we  focus  on  the 
noise-free  case.  The  extension  to  include  noise  as  in  the  previous  subsection 
is  straightforward. 

The  specific  problem  we  consider  is  the  choice  of  parity  checks  for  the 
robust  detection  of  a  particular  failure  mode.  We  assume  that  the  unfailed 
model  of  the  system  is 

$$x(k+1) = A_u(\eta)\, x(k) \tag{65}$$

$$y(k) = C_u(\eta)\, x(k) \tag{66}$$

while if the failure has occurred the model is

$$x(k+1) = A_f(\eta)\, x(k) \tag{67}$$

$$y(k) = C_f(\eta)\, x(k) \tag{68}$$

For example, if we return to the simple case $y_2 = \alpha y_1$, then under unfailed conditions one might have

$$\underline{\alpha}_u \le \alpha \le \overline{\alpha}_u \tag{69}$$

while after a failure

$$\underline{\alpha}_f \le \alpha \le \overline{\alpha}_f \tag{70}$$

This is illustrated pictorially in Figure 3. In this case one would like to choose the line $G$ onto which one projects so that one gets a small value if no failure has occurred and a large value if a failure occurs. That is, we would like $G$ to be "as orthogonal as possible" to $Z_u(\eta)$ and "as parallel as possible" to $Z_f(\eta)$.

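Returning to the general problem, we again assume that $\eta$ takes on one of a finite set of possible values, and we let $\bar{C}_{ui}$ and $\bar{C}_{fi}$ denote the counterparts of $\bar{C}_i$ in (44) for the unfailed and failed models, respectively. What we now have is a tradeoff. Specifically, we would like to make $\bar{C}_{ui}' G$ as small as possible for all $i$ and to make $\bar{C}_{fi}' G$ as large as possible. A natural criterion which reflects these objectives is

$$J(s) = \min_{G'G=I} \sum_{i=1}^{L} \left[ \|\bar{C}_{ui}' G\|_F^2 - \|\bar{C}_{fi}' G\|_F^2 \right] \tag{71}$$

If we define the matrix

$$H = [\,\underbrace{\bar{C}_{f1} : \bar{C}_{f2} : \cdots : \bar{C}_{fL}}_{M_1 \text{ columns}} : \underbrace{\bar{C}_{u1} : \bar{C}_{u2} : \cdots : \bar{C}_{uL}}_{M_2 \text{ columns}}\,] \tag{72}$$

then

$$J(s) = \min_{G'G=I} \mathrm{tr}\{G' H S H' G\} \tag{73}$$

where

$$S = \begin{bmatrix} -I & 0 \\ 0 & I \end{bmatrix} \tag{74}$$

with diagonal blocks of dimensions $M_1$ and $M_2$, respectively.
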

It is straightforward (see [3]) to show that a minor modification of the result in [17] leads to the following solution. We perform an eigenvector-eigenvalue analysis on the matrix

$$H S H' = U \Lambda U' \tag{75}$$

where $U'U = I$ and

$$\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_N) \tag{76}$$

with $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_N$ and $U = [u_1 : \cdots : u_N]$. Then the optimum choice of $G$ is

$$G = [u_1 : \cdots : u_s] \tag{77}$$

and the corresponding value of (73) is

$$J^*(s) = \sum_{i=1}^{s} \lambda_i \tag{78}$$
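The eigenvalue computation in (75)-(78) can be sketched as below, assuming NumPy. The random matrices standing in for the stacked failed and unfailed models and the function name `detection_parity_relations` are hypothetical. The sketch forms $HSH'$ by flipping the sign of the failed-model columns rather than building $S$ explicitly.

```python
import numpy as np

def detection_parity_relations(C_failed, C_unfailed):
    """Eigen-decomposition of H S H' as in (75), eigenvalues ascending."""
    H = np.hstack(list(C_failed) + list(C_unfailed))
    M1 = sum(Cf.shape[1] for Cf in C_failed)
    signs = np.concatenate([-np.ones(M1), np.ones(H.shape[1] - M1)])
    HSH = (H * signs) @ H.T            # H S H' without forming S
    lam, U = np.linalg.eigh(HSH)       # eigh returns ascending eigenvalues
    return lam, U

rng = np.random.default_rng(1)
# Hypothetical stacked matrices: two unfailed and two failed models, N = 4
C_u = [rng.standard_normal((4, 2)) for _ in range(2)]
C_f = [rng.standard_normal((4, 2)) for _ in range(2)]
lam, U = detection_parity_relations(C_f, C_u)

s = 2
G = U[:, :s]
# Criterion (71) evaluated at the chosen G
J = (sum(np.linalg.norm(Cu.T @ G, "fro") ** 2 for Cu in C_u)
     - sum(np.linalg.norm(Cf.T @ G, "fro") ** 2 for Cf in C_f))
```

Under these assumptions `J` should equal the sum of the `s` smallest eigenvalues, as in (78).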

Let us make two comments about this solution. The first is that some of the $\lambda_i$ can be negative. In fact the parity check based on $u_i$ is likely to have larger values under failed rather than unfailed conditions if and only if $\lambda_i < 0$. Thus we immediately see that the maximum number of useful parity relations for detecting this particular failure mode equals the number of negative eigenvalues of $HSH'$. As a second comment, let us contrast the procedure we use here with a singular value decomposition, which corresponds essentially to performing an eigenvector-eigenvalue analysis of $HH'$. First, assume that the first $K$ of the $\lambda_i$ are negative. Then, define

$$\sigma_1^2 = -\lambda_1, \quad \sigma_2^2 = -\lambda_2, \quad \ldots, \quad \sigma_K^2 = -\lambda_K$$

$$\sigma_{K+1}^2 = \lambda_{K+1}, \quad \ldots, \quad \sigma_N^2 = \lambda_N$$

From (75) we have that

$$H S H' = U \Sigma \tilde{S} \Sigma U'$$

where

$$\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_N)$$

and $\tilde{S}$ is the $N \times N$ diagonal matrix whose first $K$ diagonal entries are $-1$ and whose remaining entries are $+1$. Assuming that $\Sigma$ is nonsingular, define

$$V = \Sigma^{-1} U' H$$

Then (81), (82) imply that $V$ is S-orthogonal

$$V S V' = \tilde{S}$$

and that $H$ has what we call an S-singular value decomposition

$$H = U \Sigma V$$

V. An Example

We  consider  here  an  example,  adapted  from  [11],  representing  the  linearized 
dynamics  of  a  three  machine  power  system.  The  continuous  dynamics  of  this 
5th-order  system  are 


$$\dot{x}(t) = F x(t) \tag{85a}$$

$$y(t) = C x(t) \tag{85b}$$

$$x'(t) = [\Delta\omega_r, \Delta\delta_c, \Delta\omega_c, \Delta\delta_d, \Delta\omega_d] \tag{86}$$

with $\Delta\omega_r$, $\Delta\omega_c$ and $\Delta\omega_d$ being the relative angular velocities of the generator shafts with respect to a reference and $\Delta\delta_c$ and $\Delta\delta_d$ the relative angles. The $F$ matrix in (85) is

$$F = \begin{bmatrix} f_{11} & .00756 & .00486 & .00733 & -.00181 \\ 0 & 0 & 377 & 0 & 0 \\ .0122 & f_{32} & f_{33} & .0304 & -.00454 \\ 0 & 0 & 0 & 0 & 377 \\ -.292 & .163 & -.0292 & f_{54} & f_{55} \end{bmatrix} \tag{87}$$

where $f_{11}$, $f_{33}$ and $f_{55}$ are the damping factors whose values are in the range from -.15 to -.2, and $f_{32}$ and $f_{54}$ are spring coefficients whose values are not known precisely and can change from -.1 to -.4. The constant value 377 in $F$ comes from the angular frequency of 60 Hz.

We  consider  two  C  matrices. 


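$$C^1 = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix}, \qquad C^2 = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \end{bmatrix} \tag{88}$$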


representing  alternative  sensor  complements. 

In  order  to  apply  our  results  we  used  a  discretized  version  of  (85). 
Because the fastest angular frequency in any mode of this system is approximately 6.09 [11], we choose a sampling interval of 0.25 s which is roughly 1/4 the period of the fast mode.
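As a rough check on this choice, assuming the quoted value 6.09 is an angular frequency in rad/s (the unit is not stated above), the period of the fast mode and its quarter are

$$T = \frac{2\pi}{6.09} \approx 1.03\ \text{s}, \qquad \frac{T}{4} \approx 0.26\ \text{s},$$

so a sampling interval of 0.25 s gives about four samples per period of the fastest mode.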

As  we  have  indicated  we  assume  that  model  uncertainties  appear 
only in the elements $f_{11}$, $f_{33}$, $f_{55}$, $f_{32}$ and $f_{54}$ of $F$. To apply our methods we must first discretize the uncertainties, and we do this by choosing several "extreme points". Specifically, we assume that the parameters assume one of the three sets of values listed below:

         f11     f32     f33     f54     f55
i = 1    -.2     -.1     -.2     -.1     -.2
i = 2    -.15    -.4     -.15    -.4     -.15
i = 3    -.15    -.2     -.15    -.2     -.15

We  applied  the  results  of  Subsection  4.1  to  four  cases: 


case 1:  $p = 6$,  $C = C^1$
case 2:  $p = 6$,  $C = C^2$
case 3:  $p = 4$,  $C = C^1$
case 4:  $p = 4$,  $C = C^2$

and the results are depicted in Figure 4. In this figure we plot $J^*(s)$ in (47) versus $s$ for each of the four cases. This illustrates how our method can be used to compare different sensor configurations (choices of $C$) and different length data windows ($p$). For this example Case 1 is the superior one since its curve lies below the others. Note also that each individual curve provides a useful visualization of the effective level of redundancy in the system. If the curve has a dramatic "knee" (as it does in each of the four cases in the figure), one obtains a clear indication of the number of independent parity checks that can be made reliably. Note that the incremental change $J^*(s+1) - J^*(s)$ is precisely $\sigma_{s+1}^2$, so that what we are seeing in the figure is a sharp increase in the magnitude of the singular values for values of $s$ beyond the knee.

[Figure 4: $J^*(s)$ versus $s$ for the four cases.]
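The curve $J^*(s)$ and a simple knee test can be sketched as follows, assuming NumPy. The synthetic matrix with three planted weak directions, the threshold `jump`, and the function names are illustrative assumptions and are not the power system data of this section.

```python
import numpy as np

def robustness_curve(C_stack):
    """J*(s) for s = 1..N from the ascending squared singular values, as in (47)."""
    N = C_stack.shape[0]
    sig = np.zeros(N)
    sv = np.linalg.svd(C_stack, compute_uv=False)
    sig[:sv.size] = sv
    sig2 = np.sort(sig) ** 2
    return np.cumsum(sig2), sig2

def knee(sig2, jump=100.0):
    """Smallest s at which sigma_{s+1}^2 exceeds `jump` times sigma_s^2
    (a heuristic); returns N if no such jump occurs."""
    for s in range(1, len(sig2)):
        if sig2[s] > jump * max(sig2[s - 1], np.finfo(float).tiny):
            return s
    return len(sig2)

rng = np.random.default_rng(2)
# Hypothetical 8 x 30 stacked matrix with three weak directions planted
U, _ = np.linalg.qr(rng.standard_normal((8, 8)))
V, _ = np.linalg.qr(rng.standard_normal((30, 8)))
planted = np.array([1e-3, 2e-3, 5e-3, 1.0, 1.5, 2.0, 2.5, 3.0])
C_stack = U @ np.diag(planted) @ V.T
J_star, sig2 = robustness_curve(C_stack)
s_knee = knee(sig2)
```

For this synthetic input the knee falls at three relations, matching the number of planted weak directions.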


VI.  Conclusions 


In  this  paper  we  have  developed  a  series  of  methods  for  determining 
robust  parity  relations  for  failure  detection  in  dynamic  systems.  These 
methods  build  on  the  geometric  interpretation  of  parity  checks  as  orthogonal 
projections  of  windows  of  observations  onto  subspaces  which  are  as  orthogonal 
as  possible  to  the  observation  sequence  given  the  presence  of  model  uncertainties 
and  noise.  We  also  consider  modifications  of  criteria  of  this  type  in  order 
to  take  into  account  possible  differences  in  scaling  among  the  variables  of 
the  system  and  the  choice  of  parity  checks  for  the  detection  of  particular 
failure modes. In each of the cases we consider we find that a single singular
value  decomposition  (or  in  the  case  of  Section  4.3,  a  variation  thereof)  produces 
a  complete  sequence  of  orthogonal  parity  relations  ordered  in  terms  of  a 
meaningful  measure  of  robustness.  This  allows  us  to  determine  the  level  of 
robust  redundancy  in  a  system  in  an  extremely  efficient  manner  and  to  define 
those  relations  which  can  then  be  used  as  the  basis  for  designing  robust 
detection  rules. 


Appendix

A.0 Singular Value Decompositions

Let $F$ be an $N \times M$ matrix and for the sake of this discussion assume $N < M$. The singular values $\sigma_1, \ldots, \sigma_N$ are the square roots of the eigenvalues of the matrix $FF'$, and the largest of these, $\sigma_{\max}(F)$, is precisely equal to the matrix norm

$$\sup_{\|x\|=1} (x'F'Fx)^{1/2} = \|F\|_2 = \|F'\|_2 = \sup_{\|x\|=1} (x'FF'x)^{1/2} \tag{A.1}$$

The singular value decomposition of $F$ is then of the form

$$F = U \Sigma V \tag{A.2}$$

where $U$ is an $N \times N$ orthogonal matrix (i.e. its columns are orthonormal), $V$ is an $M \times M$ orthogonal matrix, and

$$\Sigma = \begin{bmatrix} \mathrm{diag}(\sigma_1, \ldots, \sigma_N) & 0 \end{bmatrix} \tag{A.3}$$

Here the columns of $U$, $(u_1, \ldots, u_N)$, are known as the left singular vectors of $F$ and the first $N$ rows of $V$, $(v_1', \ldots, v_N')$, are known as the right singular vectors of $F$. From (A.2) we see that

$$F = \sum_{i=1}^{N} \sigma_i u_i v_i' \tag{A.4}$$
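These facts are easy to check numerically. The sketch below assumes NumPy and a random matrix; note that `numpy.linalg.svd` returns the singular values in descending order, whereas the main text orders them ascending, so the factors are reordered before use.

```python
import numpy as np

rng = np.random.default_rng(3)
F = rng.standard_normal((3, 5))           # hypothetical N x M matrix, N < M

# NumPy returns F = U @ Sigma @ Vh with singular values in descending order
U, sig, Vh = np.linalg.svd(F, full_matrices=True)

# Reorder to the ascending convention used in the main text
order = np.argsort(sig)
U_asc, sig_asc, Vh_asc = U[:, order], sig[order], Vh[order, :]

# Rank-one expansion (A.4), using the first N rows of V
F_rebuilt = sum(sig_asc[i] * np.outer(U_asc[:, i], Vh_asc[i]) for i in range(3))

# The largest singular value equals the induced 2-norm, as in (A.1)
two_norm = np.linalg.norm(F, 2)
```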


For more on singular value decompositions, see [14, 15].

A.1 Singular Values and the Angle Between Subspaces

Let $H$ and $M$ denote both subspaces and matrices whose columns form orthonormal bases for the corresponding spaces. In general the orthogonal projection onto the range of a matrix $A$ is given by $A(A'A)^{-1}A'$. Thus, the orthogonal projection onto $M$ is given by

$$P_M = M(M'M)^{-1}M' = MM' \tag{A.5}$$

Consequently

$$\|P_M y\|^2 = \|MM'y\|^2 \tag{A.6}$$

Furthermore, any $y \in H$ can be written as $y = Hx$. Since $H'H = I$, $\|y\| = \|x\|$, and thus

$$\sup_{y \in H,\ \|y\|=1} \|P_M y\|^2 = \sup_{\|x\|=1} \|MM'Hx\|^2 = \sup_{\|x\|=1} x'H'MM'MM'Hx = \sup_{\|x\|=1} x'H'MM'Hx = \sigma_{\max}^2(M'H) = \sigma_{\max}^2(H'M) \tag{A.7}$$
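The identity (A.7) can be illustrated with a short sketch, assuming NumPy and two randomly generated subspaces given by orthonormal bases (the dimensions are arbitrary). It compares the largest squared projection found in closed form with projections of random unit vectors in $H$.

```python
import numpy as np

rng = np.random.default_rng(4)
# Hypothetical subspaces of R^6, given by orthonormal bases
H, _ = np.linalg.qr(rng.standard_normal((6, 2)))
M, _ = np.linalg.qr(rng.standard_normal((6, 3)))

_, sv, Vh = np.linalg.svd(M.T @ H)
sigma_max = sv[0]

# Unit vector in H that projects most strongly onto M: y = H x, with x the
# right singular vector of M'H belonging to the largest singular value
y_best = H @ Vh[0]
proj_best = np.linalg.norm(M @ (M.T @ y_best)) ** 2

# Squared projections of random unit vectors in H
X = rng.standard_normal((2, 1000))
X /= np.linalg.norm(X, axis=0)
proj_random = np.linalg.norm(M @ (M.T @ (H @ X)), axis=0) ** 2
```

Here `proj_best` should equal `sigma_max ** 2`, and no entry of `proj_random` should exceed it.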


A.2 Singular Value Decompositions and Optimum Parity Checks

Consider the problem of choosing an $N \times s$ matrix $G$ to minimize

$$J(s) = \sum_{i=1}^{L} \|H_i' G\|_F^2 \tag{A.8}$$

subject to the constraint that $G'G = I$. Note first that

$$J(s) = \|H'G\|_F^2 = \mathrm{tr}(G'HH'G) \tag{A.9}$$

where

$$H = [H_1 : H_2 : \cdots : H_L] \tag{A.10}$$

We assume that $H$ has more columns than rows. Let $H$ have the singular value decomposition

$$H = U \Sigma V \tag{A.11}$$

with $\Sigma$ as in (A.3) with $\sigma_1^2 \le \sigma_2^2 \le \cdots \le \sigma_N^2$ and with

$$U = [u_1 : u_2 : \cdots : u_N]$$

We now show that the minimum value of $J(s)$ is

$$J^*(s) = \sum_{i=1}^{s} \sigma_i^2 \tag{A.12}$$

and the optimum choice of $G$ is

$$G = [u_1 : u_2 : \cdots : u_s] \tag{A.13}$$

To  do  this  we  use  the  following  elementary  result  which  is  a  direct  consequence 
of the Courant-Fischer minimax principle [3, 14]: Suppose that

$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \tag{A.14}$$

is $n \times n$, symmetric, and positive semidefinite. Suppose also that $A_{11}$ is $m \times m$ and let $\lambda_i(A)$, $\lambda_i(A_{11})$ denote the ith smallest eigenvalue of $A$ and $A_{11}$, respectively. Then

$$\lambda_i(A) \le \lambda_i(A_{11}), \qquad i = 1, \ldots, m \tag{A.15}$$


Consider, then, any choice of $G$ satisfying the constraint $G'G = I$, and augment this matrix with $N-s$ additional columns so that the square matrix

$$F = [G : D] \tag{A.16}$$

is orthogonal. Then

$$F'HH'F = \begin{bmatrix} G'HH'G & * \\ * & * \end{bmatrix} \tag{A.17}$$

Applying (A.15) to (A.17) and using both (A.9) and the fact that $F$ is orthogonal we see that

$$\sum_{i=1}^{s} \sigma_i^2 = \sum_{i=1}^{s} \lambda_i(HH') = \sum_{i=1}^{s} \lambda_i(F'HH'F) \le \mathrm{tr}(G'HH'G) = \|H'G\|_F^2 \tag{A.18}$$

From (A.11) we see that

$$HH' = U \Sigma \Sigma' U' \tag{A.19}$$

with

$$\Sigma \Sigma' = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_N^2) \tag{A.20}$$

and from this we see that the inequality in (A.18) becomes an equality if $G$ is chosen as in (A.13), thereby proving our assertion.

We  note  that  from  this  analysis  we  can  directly  deduce  that  the  same 
choice  of  G  minimizes  a  variety  of  other  criteria.  For  example,  an  interesting 
one  is 

$$\det(G'HH'G) \tag{A.21}$$

which has the interpretation of minimizing the volume of the projection of the columns of $H$ onto the subspace $G$. The proof that the same $G$ minimizes (A.21) is also a straightforward consequence of (A.15). Specifically

$$\det(G'HH'G) = \prod_{i=1}^{s} \lambda_i(G'HH'G) \ge \prod_{i=1}^{s} \lambda_i(HH') = \prod_{i=1}^{s} \sigma_i^2 \tag{A.22}$$

with equality resulting once again if $G$ is taken as in (A.13).
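The inequalities (A.15), (A.18) and (A.22) can be spot-checked numerically. The sketch below assumes NumPy and a random $H$; the dimensions, the number of trials, and the helper `random_orthonormal` are arbitrary choices for illustration, and a finite sample of candidate matrices is of course no substitute for the proof.

```python
import numpy as np

rng = np.random.default_rng(5)
N, M, s = 5, 12, 2
H = rng.standard_normal((N, M))           # hypothetical H, more columns than rows
HHt = H @ H.T
lam, U = np.linalg.eigh(HHt)              # ascending eigenvalues, equal to sigma_i^2
G_opt = U[:, :s]                          # the choice (A.13)

def random_orthonormal(n, k):
    """Random n x k matrix with orthonormal columns."""
    Qmat, _ = np.linalg.qr(rng.standard_normal((n, k)))
    return Qmat

trace_gap, det_gap, interlace_ok = [], [], True
for _ in range(200):
    G = random_orthonormal(N, s)
    B = G.T @ HHt @ G
    mu = np.linalg.eigvalsh(B)            # ascending eigenvalues of G'HH'G
    interlace_ok &= bool(np.all(mu >= lam[:s] - 1e-10))    # (A.15)
    trace_gap.append(np.trace(B) - lam[:s].sum())          # (A.18)
    det_gap.append(np.linalg.det(B) - lam[:s].prod())      # (A.22)
```

Every entry of `trace_gap` and `det_gap` should be nonnegative, and both gaps vanish for `G_opt`.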


